Cache Hierarchies in the Standard Library
The standard library drastically reduces the number of lines needed in a configuration file, but it does so at the cost of another layer of indirection between us and the core of the gem5 simulator. So let's take a closer look at the cache hierarchies module.
You can find this module in $GEM_PATH/src/python/gem5/components/cachehierarchies/, where the standard library defines a number of easy-to-use cache hierarchies for people who care more about other parts of the system than about the exact behavior of the caches. If we look at the PrivateL1CacheHierarchy class ($GEM_PATH/src/python/gem5/components/cachehierarchies/classic/private_l1_cache_hierarchy.py), we can start dissecting how the standard library relates to the systems we built by hand in the basic examples.
Structure of a Cache Hierarchy
Notice that the PrivateL1CacheHierarchy class inherits from AbstractClassicCacheHierarchy, which we will examine later. Observe how the structure of the cache hierarchy is almost identical to the homebrew one from the basic examples, where we instantiated individual caches and connected them to the CPU ourselves. Looking at the code, the only important method that we did not implement in our basic examples is the incorporate_cache method.
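To make the role of incorporate_cache concrete, here is a minimal sketch of the pattern in plain Python. It does not import gem5: ToyBoard, ToyCacheHierarchy, and ToyPrivateL1 are hypothetical stand-ins invented for illustration, and only the method name incorporate_cache and the _l1i_size/_l1d_size attributes mirror the real class.

```python
from abc import ABC, abstractmethod


class ToyBoard:
    """Hypothetical stand-in for a gem5 board; it only records attachments."""

    def __init__(self, num_cores: int) -> None:
        self.num_cores = num_cores
        self.attached = []


class ToyCacheHierarchy(ABC):
    """Stand-in for the abstract base class: subclasses supply the wiring."""

    @abstractmethod
    def incorporate_cache(self, board: ToyBoard) -> None:
        """Create the caches and connect them to the given board."""


class ToyPrivateL1(ToyCacheHierarchy):
    def __init__(self, l1i_size: str, l1d_size: str) -> None:
        # The constructor only stores parameters; no caches exist yet.
        self._l1i_size = l1i_size
        self._l1d_size = l1d_size

    def incorporate_cache(self, board: ToyBoard) -> None:
        # Wiring is deferred until the board, and so the core count, is known.
        for core in range(board.num_cores):
            board.attached.append((core, "l1i", self._l1i_size))
            board.attached.append((core, "l1d", self._l1d_size))


board = ToyBoard(num_cores=2)
hierarchy = ToyPrivateL1(l1i_size="32KiB", l1d_size="32KiB")
hierarchy.incorporate_cache(board)  # in gem5, the board is expected to make this call
```

The point of the split is that the hierarchy can be described in one line of a configuration script, while the per-core work happens later, once the hierarchy is handed a board.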
The incorporate_cache method ensures that the caches are connected to the CPU cores on our board, and it replaces the majority of the code we wrote in our basic configurations.
We begin with the boilerplate:
@overrides(AbstractCacheHierarchy)
def incorporate_cache(self, board: AbstractBoard) -> None:
    # Set up the system port for functional access from the simulator.
    board.connect_system_port(self.membus.cpu_side_ports)

    for _, port in board.get_mem_ports():
        self.membus.mem_side_ports = port
We do two things here:
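- Connect the system port, which the simulator uses for functional accesses and which makes no tangible difference to us here
- Connect each of the board's memory ports to the memory side of the memory bus, which sits directly below the L1 caches in this hierarchy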
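The loop over board.get_mem_ports() can look odd at first, because it assigns to the same attribute on every iteration. As far as we understand gem5's port binding, mem_side_ports is a vector port, so each assignment adds a connection instead of overwriting the previous one. The sketch below imitates that behavior in plain Python; ToyBus and the port names are hypothetical and are not gem5 code.

```python
from typing import List


class ToyBus:
    """Hypothetical stand-in for a bus whose port accepts many peers."""

    def __init__(self) -> None:
        self._mem_side_peers: List[str] = []

    @property
    def mem_side_ports(self) -> List[str]:
        return self._mem_side_peers

    @mem_side_ports.setter
    def mem_side_ports(self, peer: str) -> None:
        # Assignment appends a peer, imitating how a vector port is bound.
        self._mem_side_peers.append(peer)


membus = ToyBus()

# Each entry pairs an address range with a port; the loop ignores the range.
mem_ports = [("range0", "mem_ctrl0.port"), ("range1", "mem_ctrl1.port")]
for _, port in mem_ports:
    membus.mem_side_ports = port
```

After the loop, both memory controllers are attached to the bus, which is why a board with several memory channels needs no extra code here.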
Then we get into actually creating our caches:
    ###
    # INITIALIZE THE CACHES
    ###
    self.l1icaches = [
        L1ICache(size=self._l1i_size)
        for i in range(board.get_processor().get_num_cores())
    ]
    self.l1dcaches = [
        L1DCache(size=self._l1d_size)
        for i in range(board.get_processor().get_num_cores())
    ]
    # ITLB Page walk caches
    self.iptw_caches = [
        MMUCache(size="8KiB")
        for _ in range(board.get_processor().get_num_cores())
    ]
    # DTLB Page walk caches
    self.dptw_caches = [
        MMUCache(size="8KiB")
        for _ in range(board.get_processor().get_num_cores())
    ]
This is exactly the same as what we did in the basic systems.
Finally, we connect each CPU core to its caches, and each cache to the memory bus, by setting their ports:
    if board.has_coherent_io():
        self._setup_io_cache(board)

    for i, cpu in enumerate(board.get_processor().get_cores()):
        cpu.connect_icache(self.l1icaches[i].cpu_side)
        cpu.connect_dcache(self.l1dcaches[i].cpu_side)
        self.l1icaches[i].mem_side = self.membus.cpu_side_ports
        self.l1dcaches[i].mem_side = self.membus.cpu_side_ports
        self.iptw_caches[i].mem_side = self.membus.cpu_side_ports
        self.dptw_caches[i].mem_side = self.membus.cpu_side_ports
        cpu.connect_walker_ports(
            self.iptw_caches[i].cpu_side, self.dptw_caches[i].cpu_side
        )
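To visualize what this loop builds, the following sketch lists the connections as plain (requestor, responder) string pairs. It is only an illustration: the port labels are made up for readability and the function private_l1_edges is hypothetical, but the shape follows the loop above, with four caches per core that all share one memory bus.

```python
from typing import List, Tuple

Edge = Tuple[str, str]


def private_l1_edges(num_cores: int) -> List[Edge]:
    """Return (requestor, responder) pairs for a private-L1 topology."""
    edges: List[Edge] = []
    for i in range(num_cores):
        # Core-facing side: each core talks only to its own caches.
        edges.append((f"cpu{i}.icache", f"l1i{i}.cpu_side"))
        edges.append((f"cpu{i}.dcache", f"l1d{i}.cpu_side"))
        edges.append((f"cpu{i}.itb_walker", f"iptw{i}.cpu_side"))
        edges.append((f"cpu{i}.dtb_walker", f"dptw{i}.cpu_side"))
        # Memory-facing side: all four caches share the single memory bus.
        for cache in ("l1i", "l1d", "iptw", "dptw"):
            edges.append((f"{cache}{i}.mem_side", "membus.cpu_side_ports"))
    return edges


edges = private_l1_edges(num_cores=2)
bus_fan_in = sum(1 for _, dst in edges if dst == "membus.cpu_side_ports")
print(len(edges), bus_fan_in)  # prints: 16 8
```

The bus fan-in grows by four for every core added, which is one reason larger systems tend to put a shared cache level between the private caches and the memory bus.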
The rest of the method handles a quirk of X86: each core's interrupt controller must be connected to the memory bus, so we pass a request port and a response port to connect_interrupt. For every other ISA, connect_interrupt is called with no extra ports.
        if board.get_processor().get_isa() == ISA.X86:
            int_req_port = self.membus.mem_side_ports
            int_resp_port = self.membus.cpu_side_ports
            cpu.connect_interrupt(int_req_port, int_resp_port)
        else:
            cpu.connect_interrupt()
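The branch above relies on connect_interrupt accepting either two ports or none. A minimal sketch of that calling convention, assuming optional arguments, looks like this. ToyCore, wire_interrupts, and the parameter names are hypothetical stand-ins, not the standard library's implementation.

```python
from typing import List, Optional


class ToyCore:
    """Hypothetical stand-in for a standard-library core wrapper."""

    def __init__(self) -> None:
        self.interrupt_links: List[str] = []

    def connect_interrupt(
        self,
        interrupt_requestor: Optional[str] = None,
        interrupt_responder: Optional[str] = None,
    ) -> None:
        # With no ports given, nothing is attached to the bus.
        if interrupt_requestor is None or interrupt_responder is None:
            return
        self.interrupt_links = [interrupt_requestor, interrupt_responder]


def wire_interrupts(core: ToyCore, isa: str) -> None:
    """Mirror the ISA check from the loop body above."""
    if isa == "x86":
        core.connect_interrupt("membus.mem_side_ports", "membus.cpu_side_ports")
    else:
        core.connect_interrupt()


x86_core, arm_core = ToyCore(), ToyCore()
wire_interrupts(x86_core, "x86")
wire_interrupts(arm_core, "arm")
```

Keeping the ISA check inside the cache hierarchy means a configuration script never has to special-case X86 itself.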
Where do the Caches Come From?
The gem5 standard library also defines the individual caches, in the caches/ subdirectory next to this file. They look essentially the same as the ones that we made in the basic examples, but feel free to check them out!